

Frequency Distributions: Tables and Types



Frequency and Frequency Distribution


Frequency

In statistics, the frequency of a particular data value (or observation) is the number of times that value appears in the dataset. It indicates how often a specific observation occurs within the collected data.

Frequency can be determined for individual values in both quantitative and qualitative datasets.

Example 1. The marks obtained by 10 students in a short quiz are: 8, 5, 7, 8, 6, 9, 7, 8, 6, 8. Find the frequency of the mark '8'.

Answer:

Let's count how many times the value '8' appears in the list:

8, 5, 7, 8, 6, 9, 7, 8, 6, 8

The value '8' appears 4 times.

Therefore, the frequency of the mark '8' is 4.

Example 2. A small survey recorded the mode of transport used by 12 employees to get to work: Car, Train, Car, Bus, Train, Car, Car, Bus, Train, Car, Bus, Car. Find the frequency of 'Car'.

Answer:

Let's count how many times 'Car' appears in the list:

Car, Train, Car, Bus, Train, Car, Car, Bus, Train, Car, Bus, Car

The category 'Car' appears 6 times.

Therefore, the frequency of using 'Car' as the mode of transport is 6.
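The two counts above can be checked with a short Python sketch. The variable names are illustrative; the data is copied from Examples 1 and 2, and `list.count` is the standard method that returns how many items equal its argument.

```python
# Raw data copied from Examples 1 and 2 above.
marks = [8, 5, 7, 8, 6, 9, 7, 8, 6, 8]
transport = ["Car", "Train", "Car", "Bus", "Train", "Car",
             "Car", "Bus", "Train", "Car", "Bus", "Car"]

# list.count returns the number of items equal to the argument,
# which is exactly the frequency of that value.
freq_of_8 = marks.count(8)
freq_of_car = transport.count("Car")

print(freq_of_8)    # 4
print(freq_of_car)  # 6
```

The same call works for quantitative and qualitative data, since it only tests for equality.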


Frequency Distribution

A frequency distribution is a tabular or graphical representation that displays the frequencies of various data values or categories in a dataset. It organizes raw data into a structured format by showing how often each distinct value or group of values occurs.

Essentially, a frequency distribution is a summary that shows the pattern of variation in the data. It helps in understanding the concentration of values, the range, and the overall shape of the data distribution.

A frequency distribution can be presented in different forms, such as the ungrouped and grouped tables described on this page.

The main purpose is to condense large amounts of raw data into a more easily understandable and analyzable format.

Example 1. Based on the marks data from Example 1 under 'Frequency' (8, 5, 7, 8, 6, 9, 7, 8, 6, 8), create a frequency distribution.

Answer:

We list each distinct mark and its frequency:

Mark     Frequency
5        1
6        2
7        2
8        4
9        1
Total    10

This table shows the frequency distribution of the marks obtained by the 10 students.
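As a rough illustration, the same distribution can be produced with Python's standard `collections.Counter`. This is one possible sketch, not the only way to build the table; the variable names are assumptions.

```python
from collections import Counter

# Marks from Example 1 under 'Frequency'.
marks = [8, 5, 7, 8, 6, 9, 7, 8, 6, 8]

# Counter maps each distinct value to the number of times it occurs.
distribution = Counter(marks)

# Print the distinct marks in ascending order with their frequencies.
for mark in sorted(distribution):
    print(mark, distribution[mark])

# The frequencies must add up to the number of observations.
total = sum(distribution.values())
print("Total", total)
```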


Ungrouped Frequency Distribution Table

An ungrouped frequency distribution table (also known as a simple frequency distribution table) is a method of organizing raw data by listing every distinct value that appears in the dataset and recording the number of times each value occurs. This table is suitable when the variable is discrete and has a small number of unique values, or when dealing with a small dataset where listing each value's frequency is manageable.

This table directly shows the frequency of each individual data point present in the dataset.


Construction of an Ungrouped Frequency Distribution Table

To construct an ungrouped frequency distribution table from a set of raw data, follow these steps:

  1. Identify Distinct Values: Examine the raw dataset and list all the unique values that appear in it. Arrange these values in ascending order (from the smallest to the largest) in the first column of your table.
  2. Tally the Frequencies (Optional but Recommended): Create a second column for tally marks. Go through the raw data observation by observation. For each observation, make a tally mark (|) next to the corresponding distinct value in the table. To make counting easier, group tally marks in bundles of five; the fifth mark is a diagonal line crossing the previous four ($\bcancel{||||}$).
  3. Count Frequencies: Count the total number of tally marks for each distinct value. This count represents the frequency ($f$) of that value. Record these frequencies in a third column labelled "Frequency".
  4. Calculate Total Frequency: Sum up all the frequencies in the "Frequency" column. This total sum ($\sum f$) should be equal to the total number of observations in the original raw dataset ($\text{N}$). This step serves as a check to ensure all observations have been counted correctly.

Example 1. The number of goals scored by a football team in 20 matches are: 2, 3, 0, 1, 4, 2, 1, 0, 3, 2, 4, 1, 1, 0, 2, 2, 3, 1, 2, 0. Construct an ungrouped frequency distribution table.

Answer:

The distinct values (number of goals scored) are 0, 1, 2, 3, and 4. We list these in ascending order.

Now, we tally the frequency of each score from the raw data:

  • 0 goals: Appears 4 times (0, 0, 0, 0) - $||||$
  • 1 goal: Appears 5 times (1, 1, 1, 1, 1) - $\bcancel{||||}$
  • 2 goals: Appears 6 times (2, 2, 2, 2, 2, 2) - $\bcancel{||||}\space |$
  • 3 goals: Appears 3 times (3, 3, 3) - $|||$
  • 4 goals: Appears 2 times (4, 4) - $||$

Now, we put this into a table:

Goals Scored (x)    Tally Marks                 Frequency (f) (No. of Matches)
0                   $||||$                      4
1                   $\bcancel{||||}$            5
2                   $\bcancel{||||}\space |$    6
3                   $|||$                       3
4                   $||$                        2
Total                                           20

The sum of frequencies is $4 + 5 + 6 + 3 + 2 = 20$, which matches the total number of matches given. The table is correctly constructed.
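The four construction steps can be sketched in Python for the goals data. The `tally` helper is hypothetical, written only for this illustration; it renders each bundle of five in the same `\bcancel{||||}` notation used in the text.

```python
from collections import Counter

# Goals scored in the 20 matches of Example 1.
goals = [2, 3, 0, 1, 4, 2, 1, 0, 3, 2, 4, 1, 1, 0, 2, 2, 3, 1, 2, 0]

def tally(count):
    """Hypothetical helper: render a count as tally marks in bundles of five."""
    bundles, rest = divmod(count, 5)
    parts = [r"\bcancel{||||}"] * bundles
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)

# Steps 2 and 3: count how often each distinct value occurs.
frequencies = Counter(goals)

# Step 1: list the distinct values in ascending order, one row per value.
rows = [(x, tally(frequencies[x]), frequencies[x]) for x in sorted(frequencies)]

for x, marks, f in rows:
    print(f"{x:<6}{marks:<20}{f}")

# Step 4: the total frequency must equal the number of observations N.
total = sum(f for _, _, f in rows)
assert total == len(goals)
print("Total", total)
```

The final check mirrors step 4: if the total differs from the number of observations, an observation was missed or counted twice.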

Example 2. A survey asked 15 students how many pets they own. The responses were: 1, 0, 2, 1, 1, 3, 0, 2, 1, 0, 1, 2, 2, 1, 3. Construct an ungrouped frequency distribution table for the number of pets owned by students.

Answer:

The distinct values (number of pets) are 0, 1, 2, and 3. We list these in ascending order.

Tallying the frequencies:

  • 0 pets: Appears 3 times (0, 0, 0) - $|||$
  • 1 pet: Appears 6 times (1, 1, 1, 1, 1, 1) - $\bcancel{||||}\space |$
  • 2 pets: Appears 4 times (2, 2, 2, 2) - $||||$
  • 3 pets: Appears 2 times (3, 3) - $||$

Putting this into a table:

Number of Pets (x)    Tally Marks                 Frequency (f) (No. of Students)
0                     $|||$                       3
1                     $\bcancel{||||}\space |$    6
2                     $||||$                      4
3                     $||$                        2
Total                                             15

The sum of frequencies is $3 + 6 + 4 + 2 = 15$, which matches the total number of students given. The table is correctly constructed.




Grouped Frequency Distribution Table (Class Intervals, Limits, Class Size)

When dealing with a large number of observations or when the data is continuous (meaning it can take any value within a given range), listing each individual value and its frequency becomes impractical and doesn't help in summarizing the data effectively. In such cases, the data is organized into groups or classes. A grouped frequency distribution table is then constructed, which shows the frequency of observations falling within each predefined group or class.


Need for Grouping Data

Organizing data into groups is essential for several reasons:


Key Components of a Grouped Frequency Distribution

Understanding the terminology used in grouped frequency distributions, such as class intervals, class limits, and class size, is crucial.


Types of Class Intervals

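Class intervals can be defined using two primary methods: the exclusive method, in which the upper limit of a class is excluded from that class, and the inclusive method, in which both limits are included.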


Construction of a Grouped Frequency Distribution Table

Follow these steps to construct a grouped frequency distribution table from raw data:

  1. Determine the Range of the Data: Calculate the difference between the highest and lowest values in the raw data set.

    Range = Maximum Value - Minimum Value

  2. Decide the Number of Classes ($k$): Choose an appropriate number of class intervals. While there's no strict rule, typically $5$ to $15$ classes are used. Too few classes oversimplify the data; too many classes defeat the purpose of grouping. Sturges' rule provides a guideline:

    $\text{Number of Classes} (k) \approx 1 + 3.322 \log_{10} N$    ... (1)

    where $N$ is the total number of observations. The result should be rounded up to the next whole number.

  3. Determine the Class Size / Width ($h$): Calculate the approximate width of each class interval by dividing the range by the chosen number of classes.

    $\text{Approximate Class Size} (h) \approx \frac{\text{Range}}{\text{Number of Classes}}$

    Round this value up to a convenient number (e.g., 5, 10, 20, 100) to make the class limits easy to work with. It is best to keep the class size constant for all intervals.

  4. Set up Class Limits or Boundaries: Choose a starting value for the first class. This value should be equal to or slightly less than the minimum value in the data. Then, determine the upper limit/boundary using the chosen class size and the chosen method (exclusive or inclusive). Continue setting up subsequent class intervals until the maximum value of the data is included in the last class.
  5. Tally the Observations: Go through each observation in the raw data. For each observation, place a tally mark ($|$) in the row of the class interval to which it belongs. Remember the rule for exclusive intervals (upper limit excluded) and inclusive intervals (both limits included). Use blocks of five tally marks for easier counting ($\bcancel{||||}$).
  6. Count the Frequency: Count the number of tally marks in each row to get the frequency ($f$) for each class interval.
  7. Sum the Frequencies: Add up the frequencies of all class intervals. This sum should be equal to the total number of observations ($N$). If it's not, recheck the tallying and counting process.
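Steps 1 to 3 involve a little arithmetic that can be sketched in Python. The function names are hypothetical, and the values $N = 30$ and Range $= 29$ are illustrative inputs chosen to match the weights example on this page. The result is rounded up, as step 2 suggests.

```python
import math

def sturges_classes(n):
    """Number of classes k from Sturges' rule (1), rounded up to a whole number."""
    return math.ceil(1 + 3.322 * math.log10(n))

def approximate_class_size(data_range, k):
    """Approximate class size h = Range / k, before rounding to a convenient number."""
    return data_range / k

n = 30            # total number of observations (illustrative)
data_range = 29   # Maximum Value - Minimum Value (illustrative)

k = sturges_classes(n)
h = approximate_class_size(data_range, k)

print(k)            # 6
print(round(h, 2))  # 4.83
```

An approximate class size of about 4.83 would then be rounded up to a convenient value such as 5. Sturges' rule is only a guideline, so the final number of classes may differ once a starting value is chosen.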

Example

Example 1. The weights (in kg) of 30 students are given below. Construct a grouped frequency distribution table using exclusive class intervals of size 5, starting from 40 kg.

45  52  61  48  55  63  70  51  60  68
58  42  49  57  65  62  59  46  53  66
41  56  64  50  69  54  47  58  61  56

Answer:

Given: Raw data of weights (in kg) of 30 students.

To Construct: Grouped frequency distribution table with exclusive classes of size 5, starting from 40 kg.

Solution:

The minimum value in the data is $41$ kg and the maximum value is $70$ kg.

Range = Maximum Value - Minimum Value = $70 - 41 = 29$ kg.

The desired class size is $h=5$.

The starting point for the first class is $40$. Since exclusive intervals are required (where the upper limit is excluded from the class), the class intervals will be:

  • $40 - 45$ (includes values $\geq 40$ and $< 45$)
  • $45 - 50$ (includes values $\geq 45$ and $< 50$)
  • $50 - 55$ (includes values $\geq 50$ and $< 55$)
  • $55 - 60$ (includes values $\geq 55$ and $< 60$)
  • $60 - 65$ (includes values $\geq 60$ and $< 65$)
  • $65 - 70$ (includes values $\geq 65$ and $< 70$)
  • $70 - 75$ (includes values $\geq 70$ and $< 75$)

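We continue creating intervals until the maximum value (70) is included. Because the upper limit is excluded, 70 does not belong to the 65 - 70 class; it falls in the 70 - 75 class, so a seventh class is needed.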

Now, we tally the given 30 observations by placing a tally mark ($|$) in the appropriate class interval. We then count the tally marks to find the frequency ($f$) for each class.

Weight (kg)                      Tally Marks               Frequency (f)
(Exclusive Class Intervals)                                (No. of Students)
40 - 45                          $||$                      2
45 - 50                          $\bcancel{||||}$          5
50 - 55                          $\bcancel{||||}$          5
55 - 60                          $\bcancel{||||} \ ||$     7
60 - 65                          $\bcancel{||||} \ |$      6
65 - 70                          $||||$                    4
70 - 75                          $|$                       1
Total                                                      30

The sum of frequencies is $2+5+5+7+6+4+1 = 30$, which equals the total number of students, confirming the tallying is correct.
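The tallying for exclusive classes can be sketched in Python as follows. This is one possible approach rather than a prescribed method: it uses integer division to find the class of each weight, and it relies on the classes having a constant size. The variable names are assumptions.

```python
# Weights (in kg) of the 30 students from Example 1.
weights = [45, 52, 61, 48, 55, 63, 70, 51, 60, 68,
           58, 42, 49, 57, 65, 62, 59, 46, 53, 66,
           41, 56, 64, 50, 69, 54, 47, 58, 61, 56]

start, size = 40, 5  # first lower limit and class size given in the example

# Exclusive classes: a weight w belongs to the class with
# lower <= w < lower + size. Integer division finds that lower limit.
counts = {}
for w in weights:
    lower = start + ((w - start) // size) * size
    counts[lower] = counts.get(lower, 0) + 1

# One row per class, from the first class up to the one containing the maximum.
table = [(lower, lower + size, counts.get(lower, 0))
         for lower in range(start, max(weights) + 1, size)]

for lower, upper, f in table:
    print(f"{lower} - {upper}: {f}")

# The frequencies must add up to the total number of observations.
assert sum(f for _, _, f in table) == len(weights)
```

Note that the weight 70 lands in the 70 - 75 class, not in 65 - 70, because the upper limit is excluded.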



Cumulative Frequency and Cumulative Frequency Distribution Table

In addition to knowing how many observations fall within each class interval (frequency), it is often useful to know the total number of observations that fall below or above a certain value. This is where the concept of cumulative frequency comes into play.

Cumulative frequency refers to the running total of frequencies. It helps in understanding the distribution of data by showing how many observations are less than or equal to a particular value, or greater than or equal to a particular value.


Types of Cumulative Frequency

There are two main types of cumulative frequency, depending on whether we are accumulating frequencies from the lower end or the upper end of the distribution:

  1. Less Than Cumulative Frequency (cf):

    The 'less than' cumulative frequency for a class interval is the sum of the frequencies of that class and all classes preceding it. It indicates the total count of observations whose values are less than the upper boundary of the respective class interval.

    To calculate the 'less than' cumulative frequency:

    • Start with the frequency of the first class as the 'less than' cumulative frequency for that class (or for the upper boundary of that class).
    • For each subsequent class, add its frequency to the cumulative frequency of the previous class.
    • The 'less than' cumulative frequency for the last class (or its upper boundary) will be equal to the total number of observations ($N$).

    ('Less Than' CF of current class) = ('Less Than' CF of previous class) + (Frequency of current class)

  2. More Than Cumulative Frequency:

    The 'more than' cumulative frequency for a class interval is the sum of the frequencies of that class and all classes succeeding it. It indicates the total count of observations whose values are greater than or equal to the lower boundary of the respective class interval.

    To calculate the 'more than' cumulative frequency:

    • Start with the total number of observations ($N$) as the 'more than' cumulative frequency for the lower boundary of the first class (since all observations are greater than or equal to the first lower boundary).
    • For each subsequent class (using its lower boundary), subtract the frequency of the previous class from the 'more than' cumulative frequency calculated for the previous class's lower boundary.
    • The 'more than' cumulative frequency for the lower boundary of the last class will be equal to the frequency of the last class.

    ('More Than' CF for current lower boundary) = ('More Than' CF for previous lower boundary) - (Frequency of previous class)

    Alternatively, sum frequencies starting from the bottom-most class upwards.


Cumulative Frequency Distribution Table

A cumulative frequency distribution table presents the cumulative frequencies against the corresponding class boundaries or limits.

1. Less Than Cumulative Frequency Distribution Table

This table typically lists the upper boundaries of the class intervals and their corresponding 'less than' cumulative frequencies.

Using the weight data from Example 1 (in the section on grouped frequency distribution tables) with frequencies 2, 5, 5, 7, 6, 4, 1 and Total N=30:

Weight (kg)                   Frequency (f)    Less Than Cumulative Frequency (cf)
(Less than Upper Boundary)
Less than 45                  2                2
Less than 50                  5                $2 + 5 = 7$
Less than 55                  5                $7 + 5 = 12$
Less than 60                  7                $12 + 7 = 19$
Less than 65                  6                $19 + 6 = 25$
Less than 70                  4                $25 + 4 = 29$
Less than 75                  1                $29 + 1 = 30$
Total                         30

Interpretation: From this table, we can see, for example, that 12 students have weights less than 55 kg, and 30 students have weights less than 75 kg (which is the total number of students).
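The running total in this table can be reproduced with a short Python sketch that uses the standard `itertools.accumulate` function. The variable names are illustrative.

```python
from itertools import accumulate

# Upper boundaries and class frequencies from the weight example.
upper_boundaries = [45, 50, 55, 60, 65, 70, 75]
frequencies = [2, 5, 5, 7, 6, 4, 1]

# accumulate yields the running sum: each entry adds the current
# class frequency to the cumulative frequency of the previous class.
less_than_cf = list(accumulate(frequencies))

for upper, cf in zip(upper_boundaries, less_than_cf):
    print(f"Less than {upper}: {cf}")
```

The last entry equals the total number of observations, which serves as a check on the arithmetic.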


2. More Than Cumulative Frequency Distribution Table

This table typically lists the lower boundaries of the class intervals and their corresponding 'more than' cumulative frequencies.

Using the weight data from Example 1 with frequencies 2, 5, 5, 7, 6, 4, 1 and Total N=30:

Weight (kg)                               Frequency (f)    More Than Cumulative Frequency
(More than or equal to Lower Boundary)
More than or equal to 40                  2                30
More than or equal to 45                  5                $30 - 2 = 28$
More than or equal to 50                  5                $28 - 5 = 23$
More than or equal to 55                  7                $23 - 5 = 18$
More than or equal to 60                  6                $18 - 7 = 11$
More than or equal to 65                  4                $11 - 6 = 5$
More than or equal to 70                  1                $5 - 4 = 1$
Total                                     30

Interpretation: From this table, we can see, for example, that 18 students have weights of 55 kg or more, and 1 student has a weight of 70 kg or more.
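Both ways of computing the 'more than' cumulative frequency, summing from the last class upwards and subtracting from the total, can be compared in a small Python sketch. The variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Lower boundaries and class frequencies from the weight example.
lower_boundaries = [40, 45, 50, 55, 60, 65, 70]
frequencies = [2, 5, 5, 7, 6, 4, 1]

# Method 1: accumulate from the last class backwards,
# then reverse so the result lines up with the classes.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Method 2: start from N and subtract the frequencies of all earlier classes.
n = sum(frequencies)
by_subtraction = [n - sum(frequencies[:i]) for i in range(len(frequencies))]

# Both methods should give the same column.
assert by_subtraction == more_than_cf

for lower, cf in zip(lower_boundaries, more_than_cf):
    print(f"More than or equal to {lower}: {cf}")
```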


Importance of Cumulative Frequency

Cumulative frequency distributions are fundamental tools in statistics for: